Scaling behavior in economics: II. Modeling of company growth

Abstract

In the preceding paper we presented empirical results describing the
growth of publicly-traded United States manufacturing firms within the years
1974-1993. Our results suggest that the data can be described by a scaling
approach. Here, we propose models that may lead to some insight into these
phenomena. First, we study a model in which the growth rate of a company
is affected by a tendency to retain an "optimal" size. That model leads to
an exponential distribution of the logarithm of the growth rate in agreement
with the empirical results. Then, we study a hierarchical tree-like model of
a company that enables us to relate the two parameters of the model to the
exponent $\beta$, which describes the dependence of the standard deviation of the
distribution of growth rates on size. We find that $\beta = -\ln\Pi/\ln z$, where $z$
defines the mean branching ratio of the hierarchical tree and $\Pi$ is the
probability that the lower levels follow the policy of higher levels in the hierarchy.
We also study the distribution of growth rates of this hierarchical model.
We find that the distribution is consistent with the exponential form found
empirically.

*Present Address: Theor. Physik III, Heinrich-Heine-Univ., D-40225 Düsseldorf, Germany.
†Present Address: Fakultät für Physik, Universität Konstanz, D-78434 Konstanz, Germany.
‡Present Address: Department of Physics, MIT, Cambridge, MA 02139.



I. INTRODUCTION 

The concept of scaling supports much of our current conceptualization on the general 
subject of how complex systems formed of interacting subunits behave. This concept was 
developed a quarter century ago by physicists interested in the behavior of a system near its 
critical point. Progress was made possible by a remarkable combination of experiment and 
phenomenological theory. In the preceding paper we presented empirical results suggesting 
that the scaling concept can be useful in describing economic systems [1]. In this paper we
present models which may lead to an understanding of the underlying mechanism behind 
the scaling laws. 

In the preceding paper, we used the Compustat database to study all United States (US) 
manufacturing publicly-traded firms from 1974 to 1993. The Compustat database contains 
20 years of data on all publicly-traded companies in the US. We found that the distribution 
of firm sizes remains stable for the 20 years we study, i.e., the mean value and standard 
deviation remain approximately constant. We studied the distribution of sizes of the "new" 
companies in each year and found it to be well approximated by a log-normal. However, 
we find (i) the distribution of the logarithm of the growth rates, for a growth period of one
year, and for companies with approximately the same size $S_0$, displays an exponential form

$$p(r_1|S_0) = \frac{1}{\sqrt{2}\,\sigma_1(S_0)}\exp\left(-\frac{\sqrt{2}\,|r_1 - \bar{r}_1(S_0)|}{\sigma_1(S_0)}\right), \tag{1}$$

and (ii) the fluctuations in the growth rates, measured by the width of this distribution
$\sigma_1$, scale as a power law,

$$\sigma_1(S_0) \sim S_0^{-\beta}. \tag{2}$$

Here $r_1 \equiv \ln(S_1/S_0)$, where $S_1$ is the size of the company in the next year, $\bar{r}_1(S_0)$ is the
mean of $r_1$, and $\sigma_1(S_0)$ is the standard deviation (width) of the distribution (1). We found
that the exponent $\beta$ takes the same value, within the error bars, for several measures of the
size of a company. In particular, we obtained $\beta = 0.20 \pm 0.03$ for "sales."

In this paper, we present and discuss models that, although very simple, give some 
insight into these empirical results. The paper is organized as follows. In Sect. II, we 
discuss a model that predicts an exponential distribution of growth rates. In Sect. III, we
study a hierarchical tree model that predicts the power law dependence of $\sigma_1$ on size. In
Sect. IV, we discuss how the two models can be combined so that a single model predicts 
both of our central empirical findings. In Sect. V, we summarize our findings and suggest 
avenues for future research. The paper contains three appendices. Appendix A discusses the 
relationship between the standard deviations of the growth rate and the logarithmic growth 
rate. Appendices B and C give more details of the analytical solution of the hierarchical 
tree model. 

II. THE EXPONENTIAL DISTRIBUTION OF GROWTH RATES 

As described above, one of our central findings is that the distribution of growth rates for 
companies of a given initial size has an exponential form. The result is surprising because
the sales of organizations as large as publicly traded corporations reflect a large number of
factors. While those factors are not necessarily independent and while the growth of any one 
company might be dominated by a single factor, one might nonetheless expect a Gaussian 
distribution for growth rates. 

In this section, we show how a plausible modification of Gibrat's assumptions could
lead to Eq. (1). We relax the assumption of uncorrelated growth rates and assume that
the successive growth rates are correlated in such a way that the size of a company is
"attracted" to an optimal size $S^*$. This value is reminiscent of the minimum point of a
"U-shaped" average cost curve in conventional economic theory and should evolve only slowly
in time (on the scale of years).

Let us then consider a set of companies all having initial sales $S_0$. As time passes, the
sales of each of the firms vary from day to day (or over another time interval much less
than 1 year), but tend to stay in the neighborhood of $S^*$. In the simplest case, the growth
process has a constant "back-drift," i.e.

$$\frac{S_{t+\Delta t}}{S_t} = \begin{cases} k\,(1+\epsilon_t), & S_t < S^*, \\ \dfrac{1}{k}\,(1+\epsilon_t), & S_t > S^*, \end{cases} \tag{3}$$

where $k$ is a constant larger than one and $\epsilon_t$ is an uncorrelated Gaussian random number
with zero mean and variance $\sigma_\epsilon^2 \ll 1$. These dynamics are similar to what is known in
economics as regression towards the mean, although this formulation is not standard
in economics.

Written in terms of the logarithmic growth rate $r_t \equiv \ln(S_t/S_0)$, Eq. (3) reads

$$r_{t+\Delta t} - r_t = -\ln k\;\mathrm{sgn}(r_t - r^*) + \ln(1+\epsilon_t), \tag{4}$$

where $r^* \equiv \ln(S^*/S_0)$, and $\mathrm{sgn}\,x = -1$ for $x < 0$ and $\mathrm{sgn}\,x = 1$ for $x > 0$. Since
$\sigma_\epsilon \ll 1$, we can write $\ln(1+\epsilon_t) \approx \epsilon_t$.
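As a rough numerical check of the dynamics in Eq. (4), the sketch below iterates the back-drift process with $\ln(1+\epsilon_t)$ replaced by $\epsilon_t$ and compares the sampled standard deviation of $r$ with the width $\sigma_\epsilon^2/(\sqrt{2}\ln k)$ that the continuum analysis of this section arrives at in Eq. (8). The parameter values, the function names, and the use of one long trajectory as a stand-in for an ensemble of firms are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random

def sigma1_theory(k, sigma_eps):
    """Width predicted by Eq. (8): sigma_eps^2 / (sqrt(2) * ln k)."""
    return sigma_eps ** 2 / (math.sqrt(2.0) * math.log(k))

def simulate_backdrift(k, sigma_eps, r_star=0.0, steps=1_000_000,
                       burn_in=5_000, seed=0):
    """Iterate Eq. (4) with ln(1 + eps) ~ eps; return the sampled std of r."""
    rng = random.Random(seed)
    drift = math.log(k)
    r = r_star
    total = total_sq = 0.0
    for t in range(steps + burn_in):
        # Constant back-drift towards r_star, plus Gaussian noise.
        step = -drift if r > r_star else drift
        r += step + rng.gauss(0.0, sigma_eps)
        if t >= burn_in:
            d = r - r_star
            total += d
            total_sq += d * d
    mean = total / steps
    return math.sqrt(total_sq / steps - mean * mean)

# Illustrative (hypothetical) parameters with ln k << sigma_eps << 1.
k, sigma_eps = math.exp(0.005), 0.05
predicted = sigma1_theory(k, sigma_eps)
measured = simulate_backdrift(k, sigma_eps)
print(f"predicted sigma_1 = {predicted:.3f}, simulated = {measured:.3f}")
```

The agreement can only be approximate: the continuum limit assumes that the drift per step, $\ln k$, is small compared with the noise amplitude $\sigma_\epsilon$, and a finite trajectory adds sampling error.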

For large times $t \gg \Delta t$ we can replace Eq. (4) by its continuum limit and obtain

$$\Delta t\,\frac{dr(t)}{dt} = -\ln k\,\frac{d}{dr}\,|r(t) - r^*| + \sqrt{\Delta t}\;\epsilon(t), \tag{5}$$

where now $\epsilon(t)$ is a Gaussian random field with $\langle\epsilon(t)\rangle = 0$ and
$\langle\epsilon(t)\epsilon(t')\rangle = \sigma_\epsilon^2\,\delta(t - t')$. Here, $\langle\cdots\rangle$ means an average over
realizations of the disorder and $\delta$ is the Dirac delta function. Equation (5) describes a
strongly overdamped Brownian motion of a classical particle with mass one in a potential

$$V(r) = \ln k\;|r - r^*|, \tag{6}$$

where the friction constant is $\Delta t$ and the thermal energy is $\sigma_\epsilon^2/2$ [10]. For large times
$t \gg \Delta t$ (e.g., after one year), the "particle coordinate" $r$ is distributed according to the
equilibrium Boltzmann distribution,

$$p(r_1|S_0) = \frac{\ln k}{\sigma_\epsilon^2}\,\exp\left(-\frac{2\ln k\,|r_1 - r^*|}{\sigma_\epsilon^2}\right). \tag{7}$$

Hence, we recover Eq. (1) with $\bar{r}_1(S_0) = r^*$ and

$$\sigma_1 = \frac{\sigma_\epsilon^2}{\sqrt{2}\,\ln k}. \tag{8}$$



III. THE SCALING EXPONENT $\beta$

While the model in the previous section explains Eq. (1), it does not predict our finding
about the power law dependence of the standard deviation of growth rates on firm size.
In this section, we show how a model of management hierarchies can predict Eq. (2). In
economics, it is generally presumed that the growth of firms is determined by changes in
demand and production costs. Since these features are specific to individual markets, it is
surprising that a law as simple as Eq. (2) governs the growth rate of firms operating
in very different markets. While demand and technology vary across markets, virtually all
firms have a hierarchical decision structure. One possible explanation for why there is a
simple law that governs the growth rate of all manufacturing firms is that the growth process
is dominated by properties of management hierarchies [11]. This focus on the technology of
management rather than technology of production as a basis for understanding firm growth
is reminiscent of Lucas' model of the size distribution of firms [12].

At the outset let us acknowledge a tension between our empirical results and the
theoretical model in this section. In our companion paper and in the preceding section, we
analyze the scaling properties of the distribution of the logarithmic growth rate $r_1$ and its
standard deviation $\sigma_1$. In this section we view companies as consisting of many business
units. Since the sales of a company are the sum of the sales of individual units rather than
their product, it is more convenient to analyze the standard deviation of the annual firm size
change rather than the logarithmic growth rate. Let $\Sigma_1(S_0)$ be the standard deviation of
end-of-period size for initial size $S_0$. Since $\sigma_1 \sim S_0^{-\beta}$ and since
$S_1 = S_0\exp(r_1) \approx S_0 + S_0 r_1$, it follows that $\Sigma_1(S_0) \sim S_0\,\sigma_1 \sim S_0^{1-\beta}$. As discussed in
Appendix A, $\sigma_1$ must be small for this approximation to hold.

A. Definition of the model 

Let us start by assuming that every company, regardless of its size, is made up of similarly
sized units. Thus, a company of size $S_0$ is on average made up of $N = S_0/\bar\xi$ units, where

$$\bar\xi = \frac{1}{N}\sum_{i=1}^{N}\xi_i, \tag{9}$$

and $\xi_i$ is the size of unit $i$. We further assume that the annual size change $\delta_i$ of each unit
follows a bounded distribution with zero mean and variance $\Delta$, which is independent of
$S_0$. It is important to notice that throughout this section and the following we consider
$\Delta \ll \bar\xi^2$, to ensure that sizes of units remain positive. Since some divisions after several
cycles of growth may shrink almost to zero, while others grow several times, we assume
that companies dynamically reorganize themselves so that they begin each period with
approximately equal-sized divisions and the inequality $\Delta \ll \bar\xi^2$ holds.

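If the annual size changes of the different units are independent, then the model is trivial.
Using the fact that $\langle\delta_i\rangle = 0$, we have

$$\langle S_1\rangle = S_0 + \sum_{i=1}^{N}\langle\delta_i\rangle = S_0. \tag{10}$$

The second moment of the distribution is given by

$$\langle S_1^2\rangle = \left\langle\left(S_0 + \sum_{i=1}^{N}\delta_i\right)^2\right\rangle = S_0^2 + \sum_{i=1}^{N}\sum_{j=1}^{N}\langle\delta_i\delta_j\rangle = S_0^2 + N\Delta, \tag{11}$$

where we used again the fact that the $\delta_i$ are centered and independent.
Thus, the variance in the size of the company is

$$\Sigma_1^2(S_0) = N\Delta = \frac{\Delta}{\bar\xi}\,S_0 \sim S_0. \tag{12}$$

Using the fact that $\Sigma_1(S_0) \sim S_0^{1-\beta}$ (see Appendix A), it follows that $\beta = 1/2$.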

The much smaller value of $\beta$ that we find indicates the presence of strong positive
correlations among a company's units. We can understand this result by considering the tree-like
hierarchical organization of a typical company [11]. The head of the tree represents the head
of the company, whose policy is passed to the level beneath, and so on, until finally the units
in the lowest level take action. These units have again a mean size of $\bar\xi = S_0/N$ and annual
size changes with zero mean and variance of $\Delta$. Here we assume for simplicity that at every
level other than the lowest each node is connected to exactly $z$ units in the next lowest level.
Then the number of units $N$ is equal to $z^n$, where $n$ is the number of levels (see Fig. 1).

What are the consequences of this simple model? Let us first assume that the head of
the company suggests a policy that could result in changing the size of each unit in the
lowest level by an amount $\delta_0$. If this policy is propagated through the hierarchy without
any modifications, then it is the same as assuming in Eq. (11) that all the $\delta_i$ are identical.
This implies that

$$\langle S_1^2\rangle = S_0^2 + N^2\Delta, \tag{13}$$

from which follows

$$\Sigma_1^2(S_0) = N^2\Delta = \frac{\Delta}{\bar\xi^2}\,S_0^2, \tag{14}$$

and we conclude that $\beta = 0$.
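The two limiting cases can be made concrete with a short calculation. The sketch below evaluates the widths of Eqs. (12) and (14) for a few company sizes and reads off $\beta$ from the log-log slope, using $\Sigma_1 \sim S_0^{1-\beta}$. The unit size, the variance, and the helper name `loglog_slope` are illustrative assumptions.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

xi, delta = 1.0, 0.01                       # hypothetical unit size and variance
sizes = [xi * n for n in (10, 100, 1000, 10000)]

# Independent units, Eq. (12): Sigma_1^2 = N * Delta.
width_independent = [math.sqrt((s / xi) * delta) for s in sizes]
# Perfectly coordinated units, Eq. (14): Sigma_1^2 = N^2 * Delta.
width_identical = [(s / xi) * math.sqrt(delta) for s in sizes]

# Sigma_1 ~ S_0^(1 - beta), so beta = 1 - slope.
beta_independent = 1.0 - loglog_slope(sizes, width_independent)
beta_identical = 1.0 - loglog_slope(sizes, width_identical)
print(beta_independent, beta_identical)
```

Any empirical value between these two results therefore signals partial, not perfect, coordination among units.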

Of course, it is not realistic to expect that all decisions in an organization would be 
perfectly coordinated as if they were all dictated by a single "boss." Hierarchies might 
be specifically designed to take advantage of information at different levels; and mid-level 
managers might even be instructed to deviate from decisions made at a higher level if they 
have information that strongly suggests that an alternative decision would be superior. 
Another possible explanation for some independence in decision-making is organizational 
failure, due either to poor communication or disobedience.

FIG. 1. The hierarchical-tree model of a company. We represent a company as a branching
tree. Here, the head of the company makes a decision about the change $\delta_0$ in the size of the lowest
level units. That decision is propagated through the tree. However, the decision is only followed
with a probability $\Pi$. This is pictured in the figure as a full link. With probability $(1 - \Pi)$ a new
growth rate is defined. This is pictured as a slashed link. We see that at the lowest level there are
clusters of values $\delta_i$ for the changes in size.

B. Analytical calculations 

To model the intermediate case between $\beta = 0$ and $\beta = 1/2$, let us assume that the head
of a company makes a decision to change the size of the units of a company by an amount
$\delta_0$. We also assume that $\delta_0$, for the set of all companies, has zero mean and variance $\Delta$.
Furthermore, we consider that each manager at the nodes of the hierarchical tree follows
his supervisor's policy with a probability $\Pi$, while with probability $(1 - \Pi)$ imposes a new
independent policy. The latter case corresponds to the manager acting as the head of a
smaller company made up of the units under his supervision. Hence the size of the company
becomes a random variable with a standard deviation that can be computed either with
numerical simulations or using recursion relations among the levels of the tree.
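A minimal Monte Carlo sketch of these rules is given below. It tracks only the sizes of the clusters of units that share a policy and returns $m_n$, the sum of squared cluster sizes, so that the variance of the company size for a given tree is $m_n\Delta$. The estimate is compared with the recursion for $\langle m_n\rangle$ derived in Appendix B. Function names, the seed, and the number of sampled trees are assumptions chosen for illustration.

```python
import random

def sample_m(n, z, pi, rng):
    """Draw one tree; return m_n, the sum of squared cluster sizes at level n."""
    clusters = [1]                      # level 0: the head alone is one cluster
    for _ in range(n):
        nxt = []
        for nu in clusters:
            links = z * nu              # links from this cluster to the next level
            obeying = sum(rng.random() < pi for _ in range(links))
            if obeying:
                nxt.append(obeying)     # obeying units inherit the cluster's policy
            # Each disobeying unit starts a new cluster of size one.
            nxt.extend([1] * (links - obeying))
        clusters = nxt
    return sum(nu * nu for nu in clusters)

def mean_m_recursion(n, z, pi):
    """<m_n> from <m_n> = (z*pi)^2 <m_{n-1}> + (1 - pi^2) z^n with <m_0> = 1."""
    m = 1.0
    for level in range(1, n + 1):
        m = (z * pi) ** 2 * m + (1.0 - pi ** 2) * z ** level
    return m

rng = random.Random(42)
n, z, pi = 6, 2, 0.87
samples = [sample_m(n, z, pi, rng) for _ in range(4000)]
estimate = sum(samples) / len(samples)
exact = mean_m_recursion(n, z, pi)
print(f"Monte Carlo <m_n> = {estimate:.1f}, recursion = {exact:.1f}")
```

Setting `pi=1.0` reproduces the fully coordinated limit $m_n = N^2$, and `pi=0.0` the independent limit $m_n = N$.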

Since the calculations are somewhat involved, we include them in Appendix B for the
interested reader (see also Refs. [13,14]). The main result is that the variance of the
fluctuations in an $n$-level hierarchical tree is given by

$$\Sigma_1^2(n) = \Delta\left[\frac{1-\Pi^2}{1-z\Pi^2}\,z^n - (z\Pi)^{2n}\,\frac{(z-1)\Pi^2}{1-z\Pi^2}\right]. \tag{15}$$

If $z\Pi^2 > 1$, then $(z\Pi)^{2n}$ dominates the growth, and we get

$$\Sigma_1^2(n) \sim (z\Pi)^{2n} \sim N^2\,\Pi^{2\ln N/\ln z} \sim N^2 N^{2\ln\Pi/\ln z} \sim S_0^{2+2\ln\Pi/\ln z}, \tag{16}$$

which implies $\beta = -\ln\Pi/\ln z$. On the other hand, if $z\Pi^2 < 1$, then $z^n = N$ is the dominant
term, and we obtain

$$\Sigma_1^2(n) \sim z^n \sim N \sim S_0, \tag{17}$$

which implies $\beta = 1/2$.

Finally, we can write, for $n \gg 1$, that the hierarchical model leads to

$$\beta = \begin{cases} -\ln\Pi/\ln z & \text{if } \Pi > z^{-1/2}, \\ 1/2 & \text{if } \Pi < z^{-1/2}. \end{cases} \tag{18}$$

Even for small $n$, we find that Eq. (18) is a good approximation. For example, for $z = 2$ and
$\Pi = 0.87$ we predict $\beta = 0.20$, and when we take $n = 3$ the deviation from the predicted value
is only 0.03, i.e., about 15%.

FIG. 2. Phase diagram of the hierarchical-tree model. To each pair of values of $(\Pi, z)$
corresponds a value of $\beta$. We plot the iso-curves corresponding to several values of $\beta$. In the shaded
area, marked "Uncorrelated," the model predicts that $\beta = 1/2$, i.e., that the units of the company
are uncorrelated. Our empirical data suggest that most companies have values of $\Pi$ and $z$
close to the curve for $\beta = 0.2$.
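The closed form of Eq. (15) and the asymptotic exponent of Eq. (18) are easy to evaluate numerically. The sketch below checks the closed form against the recursion of Appendix B and computes a finite-$n$ exponent from the growth of $\Sigma_1^2$ between two consecutive levels. That finite-$n$ definition (`beta_local`) is an assumption of this sketch; the paper does not state which definition underlies its quoted deviation at $n = 3$.

```python
import math

def mean_m_closed(n, z, pi):
    """Bracket of Eq. (15), i.e. Sigma_1^2(n) / Delta (undefined if z*pi^2 == 1)."""
    a = z * pi ** 2
    return ((1 - pi ** 2) * z ** n
            - (z - 1) * pi ** 2 * (z * pi) ** (2 * n)) / (1 - a)

def mean_m_recursion(n, z, pi):
    """Same quantity from the recursion of Appendix B, starting at <m_0> = 1."""
    m = 1.0
    for level in range(1, n + 1):
        m = (z * pi) ** 2 * m + (1.0 - pi ** 2) * z ** level
    return m

def beta_asymptotic(z, pi):
    """Eq. (18): -ln(pi)/ln(z) above the threshold pi = z^(-1/2), else 1/2."""
    return -math.log(pi) / math.log(z) if pi > z ** -0.5 else 0.5

def beta_local(n, z, pi):
    """Finite-n exponent: Sigma_1^2 grows by z^(2(1 - beta)) per added level."""
    ratio = mean_m_closed(n, z, pi) / mean_m_closed(n - 1, z, pi)
    return 1.0 - math.log(ratio) / (2.0 * math.log(z))

z, pi = 2, 0.87
print("asymptotic beta:", round(beta_asymptotic(z, pi), 4))
for n in (3, 5, 10):
    print(n, round(beta_local(n, z, pi), 4))
```

With this definition the finite-$n$ exponent approaches the asymptotic value from below as $n$ grows, since the subleading $z^n$ term in Eq. (15) becomes negligible.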



Equation (18) is confirmed in the two limiting cases: when $\Pi = 1$ (absolute control)
$\beta = 0$, while for all $\Pi < 1/z^{1/2}$, decisions at the upper levels of management have no
statistical effect on decisions made at lower levels, and $\beta = 1/2$. Moreover, for a given value
of $\beta < 1/2$ the control level $\Pi$ will be a decreasing function of $z$: $\Pi = z^{-\beta}$, cf. Fig. 2. For
example, if we choose the empirical value $\beta \approx 0.15$, then Eq. (18) predicts the plausible
result $0.9 > \Pi > 0.7$ for a range of $z$ in the interval $2 < z < 10$.
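The quoted range for $\Pi$ follows directly from inverting Eq. (18). A small sketch (the function name is hypothetical):

```python
def control_level(z, beta):
    """Pi = z**(-beta), the inversion of Eq. (18), valid for beta < 1/2."""
    return z ** (-beta)

for z in (2, 3, 5, 10):
    print(z, round(control_level(z, 0.15), 3))
```

For $\beta = 0.15$ the end points $z = 2$ and $z = 10$ give $\Pi \approx 0.90$ and $\Pi \approx 0.71$, matching the interval stated in the text.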

IV. COMBINING THE TWO MODELS 

We started with two central empirical findings about firm growth rates. The model in 
Section II predicts one of those findings (the shape of the distribution) and the model in 
Section III predicts the other (the power law dependence of the standard deviation of output 
on firm size). This section addresses the relationship between the two models. First, we 
address concerns that the models might be contradictory and show that they are not. Then, 
we show how the models can be combined into a single model that predicts both of our 
empirical findings. 

In the tree model, firm growth rates are potentially the result of many independent 
decisions. As a result, one might expect that the Central Limit Theorem would imply a 
Gaussian distribution of firm output. In fact, however, the distribution of outputs is not 
necessarily Gaussian. 

To address the distribution of firm output in the tree model, it is necessary to make an 
assumption about the distribution from which each independent growth decision is drawn. 
No such assumption is needed to analyze the standard deviation of firm growth rates, but
one is needed to analyze the shape of the distribution.

In Fig. 3, we show the distribution of the inputs (i.e., of each independent decision)
and the outputs for a tree with $z = 2$, $\Pi = 0.87$, and $n = 10$. We find that for Gaussian
distributed inputs, the output is not Gaussian in the tails. This finding is remarkable. First
of all, with $z = 2$ and $n = 10$, the firm consists of 1024 units. With a probability to disobey
of $1 - 0.87 = 0.13$, one would expect $0.13 \times 1024 \approx 133$ of the units to, on average, make
independent decisions about their growth rates. Thus, even for non-Gaussian inputs, one can
hypothesize that the output is close to Gaussian. Moreover, for Gaussian inputs, the sum
of independent Gaussians is itself Gaussian. Thus, for every particular configuration of the
disobeying links, the output distribution is Gaussian with variance $m\Delta$, which is a function
of this random configuration. However, there are $2^{(z^{n+1}-z)/(z-1)}$ possible configurations of
links, each of which produces a Gaussian distribution with its own integer $m$. Denoting by
$p_n^m$ the probability of obtaining a tree with a given $m$, the output distribution is the mixture

$$p_n(S_1) = \sum_m p_n^m\,\frac{1}{\sqrt{2\pi m\Delta}}\,e^{-(S_1-S_0)^2/(2m\Delta)}, \tag{19}$$

which is no longer Gaussian for the observed form of $p_n^m$.

FIG. 3. Probability density for the output and input variables in the tree model. Here we have
$z = 2$, $\Pi = 0.87$, and $n = 10$. (a) Gaussian distribution of the input. (b) Exponential distribution
of the input. (c) Data collapse of the output distribution for trees with different number of levels $n$.
The other parameters remain unchanged and the input is exponentially distributed. The similarity
of the numerical results to the empirical data of Fig. 4(b) of the preceding paper is visually apparent.

Figure 4 shows the probability $p_n^m$ to get a tree with given $m$ computed for all trees
with a given number of levels $n$, $\Pi = 0.87$, and $z = 2$. As is visually apparent in Fig. 4, this
probability density is a non-trivial function, which is discussed in more detail in Appendix
C. The final distribution of the firm output $S_1$ will thus be given by the convolution of two
densities: $p_n^m$ and a Gaussian with variance $m\Delta$.
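Why a mixture of Gaussians with different variances is not itself Gaussian can be seen from its kurtosis. For zero-mean Gaussians of variance $m\Delta$ weighted by $p_n^m$, the ratio $\langle x^4\rangle/\langle x^2\rangle^2$ equals $3\langle m^2\rangle/\langle m\rangle^2$, which exceeds the Gaussian value 3 whenever $m$ fluctuates. The sketch below illustrates this with made-up weights; the dictionary `weights` is a hypothetical stand-in for the coefficients $p_n^m$, not data from the paper.

```python
import math

def mixture_density(x, weights, delta=1.0):
    """Gaussian scale mixture of Eq. (19); x stands for S_1 - S_0."""
    return sum(p * math.exp(-x * x / (2 * m * delta))
               / math.sqrt(2 * math.pi * m * delta)
               for m, p in weights.items())

def mixture_kurtosis(weights):
    """<x^4>/<x^2>^2 = 3 <m^2>/<m>^2; equals 3 only if a single m occurs."""
    m1 = sum(p * m for m, p in weights.items())
    m2 = sum(p * m * m for m, p in weights.items())
    return 3.0 * m2 / (m1 * m1)

def n_links(z, n):
    """Number of links in an n-level tree with branching ratio z > 1."""
    return (z ** (n + 1) - z) // (z - 1)

weights = {1: 0.5, 9: 0.5}           # hypothetical two-point distribution of m
print("kurtosis:", mixture_kurtosis(weights))
print("links for z=2, n=10:", n_links(2, 10))
```

The heavier-than-Gaussian tails therefore come entirely from the spread of $m$ across tree configurations.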

In a general case, it can be shown by martingale theory [15] that for any input distribution
$f(x)$ with zero mean and finite variance $\Delta$, the output distribution converges for $n \to \infty$ to
a distribution

$$\frac{1}{\Sigma_1(n)}\,g_f\!\left(\frac{x}{\Sigma_1(n)}\right), \tag{20}$$

where $g_f$ is a function that does not depend on $n$ but depends on $f$. Thus, we cannot
expect to obtain a result that the output distribution must be exponential regardless of the
input distribution. It would, however, be desirable to find some simple input distribution
that yields the output distribution that we actually observe. Figure 3 also shows the output
distribution when the input distribution is exponential in terms of $S_1 - S_0$. For small $\sigma_1$, it
practically coincides with Eq. (1). In this case, the output distribution is nearly exponential,
and the slightly fatter wings that we observe are arguably consistent with our empirical
results. Thus, in the limit of small $\sigma_1$, we can combine the models of the two sections by
assuming that the dynamic process described in Sect. II provides the input distribution
for the tree model in Sect. III. This additional assumption in the tree model then predicts
both of our empirical findings. For large $\sigma_1$, the direct combination of two models needs
additional fine-tuning.

FIG. 4. Numerical estimation from exact enumeration of the coefficients $p_n^m$ of the generating
function $p_n(\mathcal{S})$. It is visually apparent that the coefficients scale according to Eq. (C19) even for $n$
as small as 6. This result suggests that the companies have a self-similar structure.

V. CONCLUSIONS 

The two central results of our previous paper are that the distribution of company growth
rates is exponential and the standard deviation of growth rates scales as a power law of firm
size, $\sigma_1 \sim S_0^{-\beta}$ with $\beta \approx 0.2$. Any realistic theory of the firm in economics must
be consistent with these empirical findings. In this paper, we have presented simple models 
that are consistent with our empirical findings. Indeed, the models have only two key 
assumptions. One is that each company has a natural size and the other is that decisions in 
hierarchical organizations are positively but imperfectly correlated. These models suggest 
that very simple mechanisms may provide insight into our empirical findings. 

One limitation of the model in this paper is that it only predicts our results about 
one year growth rates. A complete model of the firm would also predict the distribution 
of growth rates over longer horizons. We believe that extending this model to additional 
periods would not provide a complete description of firm dynamics. In reality, the standard
deviation of growth rates goes up as the time horizon increases. The attraction in our model
to a stable company size prevents the distribution from spreading over time as much as we 
actually observe.

APPENDIX A: WIDTH OF THE DISTRIBUTION OF FINAL SIZES

The theory in Section III establishes results about the standard deviation of the growth
rate. Our empirical results in the earlier paper concerned the standard deviation of the
logarithmic growth rate defined as $\ln(S_1/S_0)$. This appendix establishes the exact
relationship between the standard deviation of the growth rate and the standard deviation of the
logarithmic growth rate. Thus we will compute the width of the distribution of final sizes
$S_1 = S_0\exp r_1$, that we designate by $\Sigma_1(S_0)$. We can express $\Sigma_1$ as

$$\Sigma_1(S_0)^2 = \langle S_1^2\rangle - \langle S_1\rangle^2. \tag{A1}$$

Taking $\bar{r}_1(S_0) \approx 0$, and assuming that the standard deviation of the distribution is small
($\sigma_1 < 1/\sqrt{2}$, which holds for companies with sales larger than $10^6$ dollars), simple integrations
lead to

$$\langle S_1\rangle = \int S_1\,p(r_1|S_0)\,dr_1 = \frac{S_0}{1-\sigma_1^2/2}, \tag{A2}$$

and

$$\langle S_1^2\rangle = \int S_1^2\,p(r_1|S_0)\,dr_1 = \frac{S_0^2}{1-2\sigma_1^2}. \tag{A3}$$

Substituting these results into (A1) and expanding in Taylor series, we obtain

$$\Sigma_1(S_0)^2 = S_0^2\left(1 + 2\sigma_1^2 + 4\sigma_1^4 + \cdots - 1 - \sigma_1^2 - \tfrac{3}{4}\sigma_1^4 + \cdots\right) \approx (S_0\sigma_1)^2\left(1 + 13\sigma_1^2/4\right). \tag{A4}$$

Thus, to first order, we obtain

$$\Sigma_1(S_0) \approx S_0\,\sigma_1 \sim S_0^{1-\beta}. \tag{A5}$$
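To see how good the first-order result is, the exact width implied by Eqs. (A1)-(A3) can be compared with the truncated series of Eq. (A4) and with the leading term $S_0\sigma_1$. The sketch below works in units of $S_0$; the function names are illustrative.

```python
import math

def width_exact(sigma1):
    """Sigma_1 / S_0 from Eqs. (A1)-(A3); requires sigma1 < 1/sqrt(2)."""
    if not 0.0 <= sigma1 < 1.0 / math.sqrt(2.0):
        raise ValueError("second moment diverges for sigma1 >= 1/sqrt(2)")
    second_moment = 1.0 / (1.0 - 2.0 * sigma1 ** 2)
    first_moment = 1.0 / (1.0 - sigma1 ** 2 / 2.0)
    return math.sqrt(second_moment - first_moment ** 2)

def width_series(sigma1):
    """Sigma_1 / S_0 from the truncated expansion of Eq. (A4)."""
    return sigma1 * math.sqrt(1.0 + 13.0 * sigma1 ** 2 / 4.0)

for s in (0.05, 0.1, 0.2, 0.4):
    print(s, round(width_exact(s), 4), round(width_series(s), 4))
```

For small $\sigma_1$ all three expressions agree closely, which is the regime in which Section III replaces $\sigma_1$ by $\Sigma_1/S_0$.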
APPENDIX B: ANALYTICAL CALCULATION OF THE VARIANCE OF THE
GROWTH RATE FOR THE HIERARCHICAL-TREE MODEL



This appendix provides a rigorous derivation of Eq. (15). Let, as before, $S_1$ represent the
final size of a company with initial size $S_0$, and assume that the company has $n$ levels in
its hierarchical tree. According to the rules of the model, the decision of the head of the
company will only be followed by those units in the bottom level which are connected to
the top by a chain of managers with "obeying links." Thus, the number of units of the
company that follow the policy of the head of the company, $T_n$, can be related to the well
known problem of the number of male descendants of a family after $n$ generations [16]. The
solution is that for an $n$-level tree with $z$ branches the average number of units at the end is
given by

$$\langle T_n\rangle = (z\Pi)^n. \tag{B1}$$

Now, let us look at the problem of calculating $\Sigma_1(S_0)$. Our problem is slightly more
complicated since it includes double averaging over all realizations of growth rates of
independent units and over all possible configurations of the tree. Let us look at the $n$th level of
a tree with a certain configuration of obeying and disobeying links. We can define clusters
of units connected to one another through obeying links. Let us suppose that there are $M_n$
distinct clusters of size $\nu_i$. According to the rules of the model, all units in cluster $i$ share
the same value of the annual change $\delta_i$. Thus, the final size of the company will be

$$S_1 = S_0 + \sum_{i=1}^{M_n}\nu_i\,\delta_i, \tag{B2}$$

where the $\delta_i$ are independent random variables with zero mean and variance $\Delta$.

The variance in $S_1$, for a given tree with $n$ levels, can be obtained by averaging over all
realizations of $\delta_i$,

$$\Delta_n = \Delta\sum_{i=1}^{M_n}\nu_i^2 = m_n\Delta, \tag{B3}$$

where $m_n$ is a random variable depending solely on the structure of the tree,

$$m_n = \sum_{i=1}^{M_n}\nu_i^2. \tag{B4}$$

To obtain $\Sigma_1^2$ we need now to average over all possible configurations of the hierarchical
tree,

$$\Sigma_1(n)^2 = \Delta\,\langle m_n\rangle. \tag{B5}$$

In order to calculate $\langle m_n\rangle$, we will start by computing the conditional average value
$\langle m_n\rangle|_{m_{n-1}}$, where $m_{n-1}$ refers to the previous level on the tree. A cluster of size $\nu_i$ in
the $(n-1)$ level is connected to $z\nu_i$ units in the $n$-level; $\nu_i'$ of the links are obeying, while
$(z\nu_i - \nu_i')$ are disobeying. The obeying links will give rise to a cluster of size $\nu_i'$ in level $n$,
while the disobeying links give rise to $(z\nu_i - \nu_i')$ clusters of size one. Thus, we have

$$m_n = \sum_{i=1}^{M_{n-1}}\left({\nu_i'}^2 + (z\nu_i - \nu_i')\right) = \sum_{i=1}^{M_{n-1}}\left({\nu_i'}^2 - \nu_i'\right) + z\sum_{i=1}^{M_{n-1}}\nu_i = \sum_{i=1}^{M_{n-1}}\left({\nu_i'}^2 - \nu_i'\right) + z^n. \tag{B6}$$

The probability of a configuration with $\nu_i'$ obeying links is

$$\binom{z\nu_i}{\nu_i'}\,\Pi^{\nu_i'}(1-\Pi)^{z\nu_i-\nu_i'}. \tag{B7}$$

By averaging over all possible configurations of links, we obtain

$$\langle m_n\rangle|_{m_{n-1}} = \sum_{i=1}^{M_{n-1}}\left[\sum_{\nu_i'=0}^{z\nu_i}\binom{z\nu_i}{\nu_i'}\,\Pi^{\nu_i'}(1-\Pi)^{z\nu_i-\nu_i'}\left({\nu_i'}^2 - \nu_i'\right)\right] + z^n. \tag{B8}$$

The series in (B8) can be calculated with one of the traditional "tricks." Defining $q \equiv 1 - \Pi$,
$k \equiv z\nu_i$, and $j \equiv \nu_i'$, we have

$$\sum_{j=0}^{k}\binom{k}{j}\,(j^2 - j)\,\Pi^j q^{k-j} = \Pi^2\,\frac{\partial^2}{\partial\Pi^2}(\Pi + q)^k\Big|_{\Pi+q=1} = k(k-1)\,\Pi^2. \tag{B9}$$
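The identity in Eq. (B9) is the second factorial moment of a binomial distribution, and it can be verified by direct summation. A short check (the function name is illustrative):

```python
import math

def second_factorial_moment(k, pi):
    """Sum over j of C(k, j) (j^2 - j) pi^j (1 - pi)^(k - j), the l.h.s. of Eq. (B9)."""
    q = 1.0 - pi
    return sum(math.comb(k, j) * (j * j - j) * pi ** j * q ** (k - j)
               for j in range(k + 1))

for k in (1, 2, 7, 20):
    lhs = second_factorial_moment(k, 0.87)
    rhs = k * (k - 1) * 0.87 ** 2
    print(k, lhs, rhs)
```

The two columns agree to floating-point accuracy for every $k$, including the trivial cases where $k(k-1) = 0$.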

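Replacing this result into (B8), we obtain

$$\langle m_n\rangle|_{m_{n-1}} = (z\Pi)^2\sum_{i=1}^{M_{n-1}}\nu_i^2 - \Pi^2 z\sum_{i=1}^{M_{n-1}}\nu_i + z^n = (z\Pi)^2\,m_{n-1} + (1-\Pi^2)\,z^n. \tag{B10}$$

Hence $\langle m_n\rangle$ satisfies the recursion relation

$$\langle m_n\rangle = (z\Pi)^2\,\langle m_{n-1}\rangle + (1-\Pi^2)\,z^n, \qquad \langle m_0\rangle = 1. \tag{B11}$$

Writing out the first few terms of the sequence and using induction shows that

$$\langle m_n\rangle = (z\Pi)^{2n} + (1-\Pi^2)\,z^n\sum_{i=0}^{n-1}(z\Pi^2)^i. \tag{B12}$$

Replacing the geometric series by its value and simple calculations lead to

$$\langle m_n\rangle = \frac{1-\Pi^2}{1-z\Pi^2}\,z^n - (z\Pi)^{2n}\,\frac{(z-1)\Pi^2}{1-z\Pi^2}. \tag{B13}$$

Replacing this result into Eq. (B5), we get

$$\Sigma_1(n)^2 = \Delta\left[\frac{1-\Pi^2}{1-z\Pi^2}\,z^n - (z\Pi)^{2n}\,\frac{(z-1)\Pi^2}{1-z\Pi^2}\right]. \tag{B14}$$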
APPENDIX C: DISTRIBUTION OF THE OUTPUT VARIABLE FOR THE
HIERARCHICAL-TREE MODEL

In this appendix we will derive the dependence of the variance of the distribution of
growth rates for the hierarchical-tree model in a more formal way. At the same time we
will get some insight into the distribution of the number of end units that are connected
by obeying links to the head of the tree. We will concentrate on the case in which the
distribution of inputs is Gaussian.

Let us look at the $n$-th level of the tree: We can define clusters of units which are
connected to one another, in the tree, through obeying links. Thus, they share the same
value of the annual size change. Supposing there are $M_n$ distinct clusters with sizes $\nu_i$, we
have

$$N = z^n = \sum_{i=1}^{M_n}\nu_i. \tag{C1}$$

Since there is a set of possible tree structures for any given value of $\Pi$ (and $n$ and $z$), we
should consider the set of all possible values of $M_n$ ($1 \le M_n \le N$). Let us then denote
the set of all partitions of $N$ into different clusters as $\Phi$, and each of these partitions as $\phi_j$.
Naturally, the sum of the probabilities of each partition $P(\phi_j)$ verifies

$$1 = \sum_{\phi_j\in\Phi}P(\phi_j). \tag{C2}$$

It is known that for large values of $N$, the number of different partitions behaves as
$\frac{1}{\sqrt{48}\,N}\exp\left(\pi\sqrt{2N/3}\right)$.
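The growth of the number of partitions can be checked against the standard Hardy-Ramanujan leading term, $\exp(\pi\sqrt{2N/3})/(\sqrt{48}\,N)$, by counting integer partitions exactly with a small dynamic program. This is a sketch for orientation only; the function names are illustrative, and the asymptotic form is approached slowly.

```python
import math

def partition_count(n):
    """Exact number of integer partitions of n (coin-change style dynamic program)."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]

def partition_asymptotic(n):
    """Hardy-Ramanujan leading term exp(pi * sqrt(2n/3)) / (sqrt(48) * n)."""
    return math.exp(math.pi * math.sqrt(2.0 * n / 3.0)) / (math.sqrt(48.0) * n)

for n in (16, 64, 100):
    exact = partition_count(n)
    print(n, exact, round(partition_asymptotic(n) / exact, 3))
```

Even for a tree with only $N = 64$ bottom units the number of partitions is already in the millions, which is why the sum over partitions is later reorganized as a sum over $m$.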

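Let us denote the probability density of the input variable $\delta$ as $f(\delta)$. The probability
density for the output $x = s\delta$ of a cluster of $s$ units connected by obeying links is

$$f(x = s\delta) = \frac{1}{s}\,f\!\left(\frac{x}{s}\right). \tag{C3}$$

Thus, the distribution of the output variable $S = \sum_{i=1}^{M_j}x_i$ is given by

$$p_n(S) = \sum_{\phi_j\in\Phi}\frac{1}{s_1}f\!\left(\frac{x_1}{s_1}\right) * \frac{1}{s_2}f\!\left(\frac{x_2}{s_2}\right) * \cdots * \frac{1}{s_{M_j}}f\!\left(\frac{x_{M_j}}{s_{M_j}}\right)P(\phi_j), \tag{C4}$$

where $*$ denotes the convolution $g(y = x_1 + x_2) = f(x_1) * f(x_2) = \int f(x_1)\,f(y - x_1)\,dx_1$.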

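If $\delta$ is assumed to be Gaussian distributed with zero mean and unit variance,

$$f(\delta) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{\delta^2}{2}\right), \tag{C5}$$

then the convolution of the outputs of two clusters of sizes $s_1$ and $s_2$ leads to

$$g(x_1 + x_2) = \frac{1}{s_1}f\!\left(\frac{x_1}{s_1}\right) * \frac{1}{s_2}f\!\left(\frac{x_2}{s_2}\right) = \frac{1}{\sqrt{2\pi(s_1^2+s_2^2)}}\exp\left(-\frac{(x_1+x_2)^2}{2(s_1^2+s_2^2)}\right). \tag{C6}$$

Replacing this result into (C4), we obtain

$$p_n(S) = \sum_{\phi_j\in\Phi}\frac{1}{\sqrt{2\pi\sum_i s_i^2}}\exp\left(-\frac{S^2}{2\sum_i s_i^2}\right)P(\phi_j). \tag{C7}$$

A simple analysis of Eq. (C7) shows that any two partitions, $\phi_j$ and $\phi_k$, are equivalent in
terms of their output distributions if they verify

$$\sum_{i=1}^{M_j}s_i^2 = \sum_{i=1}^{M_k}s_i^2 = m. \tag{C8}$$

On the other hand, the triangular (or Schwarz) inequality allows us to determine the possible
number of partitions that are not equivalent because of the constraints on the value of $m$,

$$N = \sum_{i=1}^{N}1^2 \le \sum_{i=1}^{M_j}s_i^2 \le \left(\sum_{i=1}^{M_j}s_i\right)^2 = N^2. \tag{C9}$$

Equations (C8)-(C9) imply that the sum in (C7) over different partitions can be replaced
by a sum over $m$. Thus, we can write asymptotically

$$p_n(S) = \sum_{m=N}^{N^2}p_n^m\,\frac{1}{\sqrt{m}}\,f\!\left(\frac{S}{\sqrt{m}}\right), \tag{C10}$$

and, finally,

$$p_n(S) = \sum_{m=N}^{N^2}p_n^m\,\frac{1}{\sqrt{2\pi m}}\exp\left(-\frac{S^2}{2m}\right), \tag{C11}$$

where $p_n^m$ is the total probability of all equivalent partitions with given $n$ and $m$.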

The standard way to calculate the coefficients $p_n^m$ is to introduce a generating function

$$p_n(\mathcal{S}) = \sum_{m=N}^{N^2}p_n^m\,\mathcal{S}^m, \tag{C12}$$

which is a polynomial of order $N^2$. To obtain the recursion relations for $p_n(\mathcal{S})$, we need to
distinguish the cluster of units which is connected to the top of the tree from those clusters
that are not. For each level $n$ we have a matrix of coefficients $p_n^{\ell k}$ that characterizes the
probability of the partition with the cluster of $\ell$ elements connected to the head of the tree
and the sum of squares of the rest of the cluster sizes equal to $k$. Thus, we can look at the
tree as made of two parts, the one connected to the top, with size $\ell$, and the remaining of
size $(N - \ell)$. Here we introduce the full generating function

$$p_n(y,\mathcal{S}) = \sum_{\ell=0}^{N}\;\sum_{k=N-\ell}^{(N-\ell)^2}p_n^{\ell k}\,y^\ell\,\mathcal{S}^k, \tag{C13}$$

where $m = \ell^2 + k$.

The reduced generating function $p_n(\mathcal{S})$ can be obtained from the full generating function
$p_n(y,\mathcal{S})$ if one formally puts $y^\ell = \mathcal{S}^{\ell^2}$ in Eq. (C13). In order to obtain the recursion relation
for the full generating function, let us consider a tree with $n+1$ levels as $z$ trees connected by
another level of branches to the top. If an $n$-level tree is connected to the top by a disobeying
link, which happens with probability $(1 - \Pi)$, its clusters are totally independent of the other
branches and we can use the reduced generating function $p_n(\mathcal{S})$. If, however, an $n$-level tree
is connected to the top by an obeying link, which happens with probability $\Pi$, its clusters
merge with the clusters of other such trees, and the full generating function $p_n(y,\mathcal{S})$ must
be used. Thus, the generating function of level $n+1$ is related to the generating function
of level $n$ through the recursion relation

$$p_{n+1}(y,\mathcal{S}) = \left(\Pi\,p_n(y,\mathcal{S}) + (1-\Pi)\,p_n(\mathcal{S})\right)^z. \tag{C14}$$
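Although the recursion (C14) resists a closed-form solution, it can be iterated numerically by storing each generating function as a table of coefficients. The sketch below is one possible implementation, assuming the starting point $p_0(y,\mathcal{S}) = y$ (a single unit that is its own head); the dictionary representation and all function names are choices made here, not taken from the paper.

```python
from collections import defaultdict

def multiply(a, b):
    """Product of two polynomials in (y, S) stored as {(l, k): coefficient}."""
    out = defaultdict(float)
    for (l1, k1), c1 in a.items():
        for (l2, k2), c2 in b.items():
            out[(l1 + l2, k1 + k2)] += c1 * c2
    return dict(out)

def reduce_gf(full):
    """Substitute y^l -> S^(l^2): returns {m: p_n^m} with m = l^2 + k."""
    out = defaultdict(float)
    for (l, k), c in full.items():
        out[l * l + k] += c
    return dict(out)

def next_level(full, z, pi):
    """One application of Eq. (C14)."""
    branch = defaultdict(float)
    for key, c in full.items():
        branch[key] += pi * c                  # obeying link: keep (l, k)
    for m, c in reduce_gf(full).items():
        branch[(0, m)] += (1.0 - pi) * c       # disobeying link: subtree counts in k
    result = {(0, 0): 1.0}
    for _ in range(z):
        result = multiply(result, branch)
    return result

def coefficients(n, z, pi):
    full = {(1, 0): 1.0}                       # assumed p_0(y, S) = y
    for _ in range(n):
        full = next_level(full, z, pi)
    return full

z, pi, n = 2, 0.87, 4
full = coefficients(n, z, pi)
p_m = reduce_gf(full)
mean_l = sum(l * c for (l, k), c in full.items())
mean_m = sum(m * c for m, c in p_m.items())
print(f"<T_n> = {mean_l:.4f}, <m_n> = {mean_m:.4f}")
```

The number of stored coefficients grows quickly with $n$, so this direct enumeration is practical only for shallow trees.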

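Unfortunately, to our knowledge, this recursion relation is too complex to allow any
simplification or solution. Thus, we cannot obtain the distributions of cluster sizes for the different
values of $n$. On the other hand, the problem of obtaining the average value of $\ell$ (which was
earlier designated $\langle T_n\rangle$) and the variance $\Sigma_1(n)^2$ of the output variable is relatively simple.
Indeed,

$$\langle T_n\rangle = \frac{\partial}{\partial y}\,p_n(y,\mathcal{S})\Big|_{y=1,\,\mathcal{S}=1}. \tag{C15}$$

Combining (C14) and (C15), we obtain

$$\langle T_{n+1}\rangle = \frac{\partial}{\partial y}\,p_{n+1}(y,\mathcal{S})\Big|_{y=1,\,\mathcal{S}=1} = z\Pi\,\frac{\partial}{\partial y}\,p_n(y,\mathcal{S})\Big|_{y=1,\,\mathcal{S}=1} = z\Pi\,\langle T_n\rangle, \tag{C16}$$

and we recover Eq. (B1). The variance can also be easily obtained as [13]

$$\langle m_n\rangle = \frac{\partial}{\partial y}\,y\,\frac{\partial}{\partial y}\,p_n(y,\mathcal{S})\Big|_{y=1,\,\mathcal{S}=1} + \frac{\partial}{\partial\mathcal{S}}\,p_n(y,\mathcal{S})\Big|_{y=1,\,\mathcal{S}=1}, \tag{C17}$$

which, after some algebra, leads to

$$\langle m_{n+1}\rangle = z\,\langle m_n\rangle + z(z-1)\,\Pi^2\,\langle T_n\rangle^2, \tag{C18}$$

which is equivalent to Eq. (B11).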

Although, as discussed earlier, the coefficients $p_n^m$ cannot be calculated analytically, we
can use Eq. (C14) to find their values numerically (see Fig. 4). Moreover, for $z\Pi^2 > 1$ the
coefficients $p_n^m$ of the reduced generating function $p_n(\mathcal{S})$ obey the scaling relation (C19)
for large $n$. This can be proven applying the martingale theory [15]. Indeed, the sequence

$$\mu_n = \frac{m_n}{(z\Pi)^{2n}} + \frac{1-\Pi^2}{(z\Pi^2-1)\,(z\Pi^2)^n} \tag{C20}$$

obeys the martingale conditions: from the conditional average of $m_n$ computed in Appendix B
it follows that $\langle\mu_n\rangle|_{\mu_{n-1}} = \mu_{n-1}$. It also
can be shown that $\mu_n$ has limited variance for any $n$, and hence it follows that for large $n$
the scaling relation (C19) is valid.
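A necessary consequence of the martingale property is that the mean of the sequence does not depend on $n$. The sketch below checks this for the combination $m_n/(z\Pi)^{2n} + (1-\Pi^2)/[(z\Pi^2-1)(z\Pi^2)^n]$, using $\langle m_n\rangle$ from the recursion (B11). The explicit form of the additive term is reconstructed here from that recursion and should be read as an assumption of this sketch.

```python
def mean_m(n, z, pi):
    """<m_n> from the recursion (B11) with <m_0> = 1."""
    m = 1.0
    for level in range(1, n + 1):
        m = (z * pi) ** 2 * m + (1.0 - pi ** 2) * z ** level
    return m

def mean_mu(n, z, pi):
    """Mean of the rescaled sequence; requires z * pi^2 > 1."""
    a = z * pi ** 2
    return mean_m(n, z, pi) / (z * pi) ** (2 * n) + (1.0 - pi ** 2) / ((a - 1.0) * a ** n)

z, pi = 2, 0.87
values = [mean_mu(n, z, pi) for n in range(11)]
print(values[0], values[-1])
```

The common value equals $(z-1)\Pi^2/(z\Pi^2-1)$, which is also the prefactor of the dominant $(z\Pi)^{2n}$ term in Eq. (15).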